{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Tutorial 04: Visualizing Experiment Results"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "This tutorial describes the process of visualizing the results of Flow experiments, and of replaying them. \n",
    "\n",
    "**Note:** This tutorial is only relevant if you use SUMO as a simulator. We currently do not support policy replay nor data collection when using Aimsun. The only exception is for reward plotting, which is independent on whether you have used SUMO or Aimsun during training."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "## 1. Visualization components"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "The visualization of simulation results breaks down into three main components:\n",
    "\n",
    "- **reward plotting**: Visualization of the reward function is an essential step in evaluating the effectiveness and training progress of RL agents.\n",
    "\n",
    "- **policy replay**: Flow includes tools for visualizing trained policies using SUMO's GUI. This enables more granular analysis of policies beyond their accrued reward, which in turn allows users to tweak actions, observations and rewards in order to produce some desired behavior. The visualizers also generate plots of observations and a plot of the reward function over the course of the rollout.\n",
    "\n",
    "- **data collection and analysis**: Any Flow experiment can output its simulation data to a CSV file, `emission.csv`, containing the contents of SUMO's built-in `emission.xml` files. This file contains various data such as the speed, position, time, fuel consumption and many other metrics for every vehicle in the network and at each time step of the simulation. Once you have generated the `emission.csv` file, you can open it and read the data it contains using Python's [csv library](https://docs.python.org/3/library/csv.html) (or using Excel)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Visualization is different depending on which reinforcement learning library you are using, if any. Accordingly, the rest of this tutorial explains how to plot rewards, replay policies and collect data when using either no RL library, RLlib, or stable-baselines. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "**Contents:**\n",
    "\n",
    "[How to visualize using SUMO without training](#2.1---Using-SUMO-without-training)\n",
    "\n",
    "[How to visualize using SUMO with RLlib](#2.2---Using-SUMO-with-RLlib)\n",
    "\n",
    "[**_Example: visualize data on a ring trained using RLlib_**](#2.3---Example:-Visualize-data-on-a-ring-trained-using-RLlib)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "## 2. How to visualize"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "### 2.1 - Using SUMO without training\n",
    "\n",
    "_In this case, since there is no training, there is no reward to plot and no policy to replay._"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "#### Data collection and analysis\n",
    "\n",
    "SUMO-only experiments can generate emission CSV files seamlessly:\n",
    "\n",
    "First, you have to tell SUMO to generate the `emission.xml` files. You can do that by specifying `emission_path` in the simulation parameters (class `SumoParams`), which is the path where the emission files will be generated. For instance:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from flow.core.params import SumoParams\n",
    "\n",
    "sim_params = SumoParams(sim_step=0.1, render=True, emission_path='data')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Then, you have to tell Flow to convert these XML emission files into CSV files. To do that, pass in `convert_to_csv=True` to the `run` method of your experiment object. For instance:\n",
    "\n",
    "```python\n",
    "exp.run(1, convert_to_csv=True)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "When running experiments, Flow will now automatically create CSV files next to the SUMO-generated XML files."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "### 2.2 - Using SUMO with RLlib "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "#### Reward plotting\n",
    "\n",
    "RLlib supports reward visualization over the period of the training using the `tensorboard` command. It takes one command-line parameter, `--logdir`, which is an RLlib result directory. By default, it would be located within an experiment directory inside your `~/ray_results` directory. \n",
    "\n",
    "An example call would look like:\n",
    "\n",
    "`tensorboard --logdir ~/ray_results/experiment_dir/result/directory`\n",
    "\n",
    "You can also run `tensorboard --logdir ~/ray_results` if you want to select more than just one experiment.\n",
    "\n",
    "If you do not wish to use `tensorboard`, an other way is to use our `flow/visualize/plot_ray_results.py` tool. It takes as arguments:\n",
    "\n",
    "- the path to the `progress.csv` file located inside your experiment results directory (`~/ray_results/...`),\n",
    "- the name(s) of the column(s) you wish to plot (reward or other things).\n",
    "\n",
    "An example call would look like:\n",
    "\n",
    "`flow/visualize/plot_ray_results.py ~/ray_results/experiment_dir/result/progress.csv training/return-average training/return-min`\n",
    "\n",
    "If you do not know what the names of the columns are, run the command without specifying any column:\n",
    "\n",
    "`flow/visualize/plot_ray_results.py ~/ray_results/experiment_dir/result/progress.csv`\n",
    "\n",
    "and the list of all available columns will be displayed to you."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "#### Policy replay\n",
    "\n",
    "The tool to replay a policy trained using RLlib is located at `flow/visualize/visualizer_rllib.py`. It takes as argument, first the path to the experiment results (by default located within `~/ray_results`), and secondly the number of the checkpoint you wish to visualize (which correspond to the folder `checkpoint_<number>` inside the experiment results directory).\n",
    "\n",
    "An example call would look like this:\n",
    "\n",
    "`python flow/visualize/visualizer_rllib.py ~/ray_results/experiment_dir/result/directory 1`\n",
    "\n",
    "There are other optional parameters which you can learn about by running `visualizer_rllib.py --help`. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "#### Data collection and analysis\n",
    "\n",
    "Simulation data can be generated the same way as it is done [without training](#2.1---Using-SUMO-without-training).\n",
    "\n",
    "If you need to generate simulation data after the training, you can run a policy replay as mentioned above, and add the `--gen-emission` parameter.\n",
    "\n",
    "An example call would look like:\n",
    "\n",
    "`python flow/visualize/visualizer_rllib.py ~/ray_results/experiment_dir/result/directory 1 --gen_emission`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.3 - Example: Visualize data on a ring trained using RLlib"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pwd  # make sure you are in the flow/tutorials folder"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The folder `flow/tutorials/data/trained_ring` contains the data generated in `ray_results` after training an agent on a ring scenario for 200 iterations using RLlib (the experiment can be found in `flow/examples/rllib/stabilizing_the_ring.py`).\n",
    "\n",
    "Let's first have a look at what's available in the `progress.csv` file:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!python ../flow/visualize/plot_ray_results.py data/trained_ring/progress.csv"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This gives us a list of everything that we can plot. Let's plot the reward and its boundaries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib notebook\n",
    "# if this doesn't display anything, try with \"%matplotlib inline\" instead\n",
    "%run ../flow/visualize/plot_ray_results.py data/trained_ring/progress.csv \\\n",
    "episode_reward_mean episode_reward_min episode_reward_max"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can see that the policy had already converged by the iteration 50.\n",
    "\n",
    "Now let's see what this policy looks like. Run the following script, then click on the green arrow to run the simulation (you may have to click several times)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!python ../flow/visualize/visualizer_rllib.py data/trained_ring 200 --horizon 2000"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The RL agent is properly stabilizing the ring! \n",
    "\n",
    "Indeed, without an RL agent, the vehicles start forming stop-and-go waves which significantly slows down the traffic, as you can see in this simulation:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!python ../examples/simulate.py ring"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the trained ring folder, there is a checkpoint generated every 20 iterations. Try to run the second previous command but replace 200 by 20. On the reward plot, you can see that the reward is already quite high at iteration 20, but hasn't converged yet, so the agent will perform a little less well than at iteration 200."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "That's it for this example! Feel free to play around with the other scripts in `flow/visualize`. Run them with the `--help` parameter and it should tell you how to use it. Also, if you need the emission file for the trained ring, you can obtain it by running the following command:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "!python ../flow/visualize/visualizer_rllib.py data/trained_ring 200 --horizon 2000 --gen_emission"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The path where the emission file is generated will be outputted at the end of the simulation."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  },
  "widgets": {
   "state": {},
   "version": "1.1.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
